Test-time scaling

See also Test time training.

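Muennighoff2025s1 showed that you can improve the performance of LLMs on reasoning tasks at inference time by prolonging the “thinking” process: whenever the model tries to end its reasoning, a “Wait” token is appended so that it keeps thinking (the paper calls this budget forcing).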
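To make the idea concrete, here is a minimal sketch of the "append Wait and keep generating" loop. It is not the paper's implementation: `generate_until_stop` is a hypothetical callable standing in for a real LLM call that returns text up to the point where the model wants to stop thinking, `END_OF_THINKING` is an assumed delimiter, and `toy_model` is a stub used only so the example runs.

```python
from typing import Callable

# Assumed end-of-thinking delimiter; real models define their own.
END_OF_THINKING = "</think>"


def budget_forced_thinking(
    generate_until_stop: Callable[[str], str],
    prompt: str,
    num_extensions: int = 2,
    wait_text: str = "Wait",
) -> str:
    """Return a reasoning trace that was extended `num_extensions` times.

    `generate_until_stop` is a hypothetical stand-in for an LLM call: it
    takes the context so far and returns the continuation up to (but not
    including) the point where the model would end its thinking.
    """
    trace = generate_until_stop(prompt)
    for _ in range(num_extensions):
        # The model wanted to stop here. Instead of emitting the
        # end-of-thinking delimiter, append "Wait" and let it continue.
        trace += wait_text
        trace += generate_until_stop(prompt + trace)
    # Only now is the thinking phase allowed to end.
    return trace + END_OF_THINKING


def toy_model(context: str) -> str:
    """Stub model: emits one numbered 'step' per call (illustration only)."""
    return f" step {context.count('Wait')}. "


if __name__ == "__main__":
    print(budget_forced_thinking(toy_model, "Q: 2+2?", num_extensions=2))
```

With a real model, `num_extensions` is the knob that trades extra inference compute for (hopefully) better answers; setting it to 0 recovers ordinary generation.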